Approximate Nearest Neighbor Search on High Dimensional Data - Experiments, Analyses, and Improvement (v1.0)
Authors
Abstract
Approximate nearest neighbor search (ANNS) is a fundamental and essential operation in applications from many domains, such as databases, machine learning, multimedia, and computer vision. Although many algorithms are proposed in the literature of these domains each year, there has been no comprehensive evaluation and analysis of their performance. In this paper, we conduct a comprehensive experimental evaluation of many state-of-the-art methods for approximate nearest neighbor search. Our study (1) is cross-disciplinary (i.e., it includes 16 algorithms from different domains and from practitioners) and (2) evaluates a diverse range of settings, including 20 datasets, several evaluation metrics, and different query workloads. The experimental results are carefully reported and analyzed to explain the observed performance. Furthermore, we propose a new method that empirically achieves both high query efficiency and high recall on the majority of the datasets under a wide range of settings.
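The abstract names recall and query efficiency as evaluation criteria but does not spell out how they are measured. As a rough illustration only, the sketch below assumes recall is the fraction of the exact k nearest neighbors (under Euclidean distance) that an approximate method returns. The function names (`exact_knn`, `recall_at_k`, `sampled_knn`) are hypothetical, and `sampled_knn` is a toy stand-in that scans a random subset; it is not one of the algorithms evaluated in the paper.

```python
import math
import random

def exact_knn(data, query, k):
    """Brute-force k nearest neighbors by Euclidean distance (ground truth)."""
    order = sorted(range(len(data)), key=lambda i: math.dist(data[i], query))
    return order[:k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true nearest neighbors present in the approximate result."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

def sampled_knn(data, query, k, sample_size, rng):
    """Toy approximate search (assumption for illustration): scan a random subset."""
    candidates = rng.sample(range(len(data)), sample_size)
    candidates.sort(key=lambda i: math.dist(data[i], query))
    return candidates[:k]

# Synthetic data: 1000 random points in 16 dimensions, one random query.
rng = random.Random(0)
dim, n, k = 16, 1000, 10
data = [[rng.random() for _ in range(dim)] for _ in range(n)]
query = [rng.random() for _ in range(dim)]

truth = exact_knn(data, query, k)
approx = sampled_knn(data, query, k, sample_size=500, rng=rng)
print(f"recall@{k} = {recall_at_k(approx, truth):.2f}")
```

Scanning fewer candidates lowers query cost but also tends to lower recall, which is the kind of trade-off an ANNS benchmark typically reports.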
Related resources
Exact and Approximate Reverse Nearest Neighbor Search for Multimedia Data
Reverse nearest neighbor queries are useful in identifying objects that are of significant influence or importance. Existing methods either rely on pre-computation of nearest neighbor distances, do not scale well with high dimensionality, or do not produce exact solutions. In this work we motivate and investigate the problem of reverse nearest neighbor search on high dimensional, multimedia dat...
HDIdx: High-dimensional indexing for efficient approximate nearest neighbor search
Fast Nearest Neighbor (NN) search is a fundamental challenge in large-scale data processing and analytics, particularly for analyzing multimedia contents which are often of high dimensionality. Instead of using exact NN search, extensive research efforts have been focusing on approximate NN search algorithms. In this work, we present “HDIdx”, an efficient high-dimensional indexing library for f...
Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration
For many computer vision problems, the most time consuming component consists of nearest neighbor matching in high-dimensional spaces. There are no known exact algorithms for solving these high-dimensional problems that are faster than linear search. Approximate algorithms are known to provide large speedups with only minor loss in accuracy, but many such algorithms have been published with onl...
Approximate Nearest Line Search in High Dimensions
We consider the Approximate Nearest Line Search (NLS) problem. Given a set L of N lines in the high dimensional Euclidean space R^d, the goal is to build a data structure that, given a query point q ∈ R^d, reports a line ℓ ∈ L such that its distance to the query is within a (1+ε) factor of the distance of the closest line to the query point q. The problem is a natural generalization of the well-studi...
An Investigation of Practical Approximate Nearest Neighbor Algorithms
This paper concerns approximate nearest neighbor searching algorithms, which have become increasingly important, especially in high dimensional perception areas such as computer vision, with dozens of publications in recent years. Much of this enthusiasm is due to a successful new approximate nearest neighbor approach called Locality Sensitive Hashing (LSH). In this paper we ask the question: c...
Journal title: CoRR
Volume: abs/1610.02455
Pages: -
Publication date: 2016